Linear Convergence of SVRG in Statistical Estimation
Authors
Abstract
SVRG and its variants are among the state-of-the-art optimization algorithms for large-scale machine learning problems. It is well known that SVRG converges linearly when the objective function is strongly convex. However, this setup does not cover several important formulations, such as Lasso, group Lasso, and logistic regression, among others. In this paper, we prove that, for a class of statistical M-estimators where strong convexity does not hold, SVRG solves the formulation at a linear convergence rate. Our analysis makes use of restricted strong convexity, under which we show that SVRG converges linearly up to the fundamental statistical precision of the model, i.e., the difference between the true unknown parameter θ∗ and the optimal solution θ̂ of the model. This improves upon previous convergence analyses for the non-strongly convex setting, which achieve only sub-linear convergence rates.
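To make the algorithm under analysis concrete, here is a minimal sketch of proximal SVRG applied to a Lasso objective, one of the non-strongly convex formulations named above. The function names, step size, and epoch count are illustrative assumptions, not taken from the paper.

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def prox_svrg_lasso(A, b, lam, eta=0.01, epochs=30, seed=0):
    """Prox-SVRG sketch for min_theta (1/2n)||A theta - b||^2 + lam*||theta||_1.

    Each epoch computes the full gradient once at a snapshot, then runs n
    variance-reduced proximal stochastic steps.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        snapshot = theta.copy()
        mu = A.T @ (A @ snapshot - b) / n              # full gradient at the snapshot
        for _ in range(n):
            i = rng.integers(n)
            g = A[i] * (A[i] @ theta - b[i])           # sample gradient at current point
            g_snap = A[i] * (A[i] @ snapshot - b[i])   # sample gradient at the snapshot
            v = g - g_snap + mu                        # variance-reduced direction
            theta = soft_threshold(theta - eta * v, eta * lam)
    return theta
```

On a toy sparse regression instance, the iterates approach the Lasso optimum θ̂ geometrically; the residual gap between θ̂ and the true parameter θ∗ is the statistical precision the abstract refers to.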
Similar Articles
Stochastic Variance Reduction for Nonconvex Optimization
We study nonconvex finite-sum problems and analyze stochastic variance reduced gradient (Svrg) methods for them. Svrg and related methods have recently surged into prominence for convex optimization given their edge over stochastic gradient descent (Sgd); but their theoretical analysis almost exclusively assumes convexity. In contrast, we prove non-asymptotic rates of convergence (to stationary...
Larger is Better: The Effect of Learning Rates Enjoyed by Stochastic Optimization with Progressive Variance Reduction
In this paper, we propose a simple variant of the original stochastic variance reduced gradient (SVRG) [1], which we hereafter refer to as variance reduced stochastic gradient descent (VR-SGD). Different from the choices of the snapshot point and starting point in SVRG and its proximal variant, Prox-SVRG [2], the two vectors of each epoch in VR-SGD are set to the average and last iterate o...
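To make the truncated sentence concrete, the sketch below contrasts how the two per-epoch vectors are chosen: the snapshot is where the full gradient is evaluated, and the start is where the inner loop resumes. Function names are ours, and the VR-SGD rule shown is our reading of the abstract.

```python
import numpy as np

def svrg_epoch_points(iterates):
    # SVRG: a single vector serves as both the snapshot (the point
    # where the full gradient is computed) and the next starting point.
    last = iterates[-1]
    return last, last

def vr_sgd_epoch_points(iterates):
    # VR-SGD: the snapshot is the average of the epoch's iterates,
    # while the next epoch starts from the last iterate.
    snapshot = np.mean(iterates, axis=0)
    return snapshot, iterates[-1]
```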
Linear Wavelet-Based Estimation for Derivative of a Density under Random Censorship
In this paper we consider estimation of the derivative of a density based on wavelet methods using randomly right-censored data. We extend the results regarding asymptotic convergence rates due to Prakasa Rao (1996) and Chaubey et al. (2008) under the random censorship model. Our treatment is facilitated by results of Stute (1995) and Li (2003) that enable us to demonstrate that the same con...
Barzilai-Borwein Step Size for Stochastic Gradient Descent
One of the major issues in stochastic gradient descent (SGD) methods is how to choose an appropriate step size while running the algorithm. Since the traditional line search technique does not apply to stochastic optimization algorithms, the common practice in SGD is either to use a diminishing step size or to tune a fixed step size by hand. Apparently, these two approaches can be time consum...
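A concrete rendering of the alternative this line of work proposes: compute the step size for each epoch from consecutive snapshots and their full gradients, rather than tuning it by hand. The 1/m scaling follows the SVRG-BB variant; variable names are ours.

```python
import numpy as np

def bb_step_size(theta_prev, theta, grad_prev, grad, m):
    """Barzilai-Borwein step size for one epoch:
    eta = (1/m) * ||s||^2 / (s^T y), where s and y are the differences
    of consecutive snapshots and of their full gradients, and m is the
    inner-loop length of the epoch."""
    s = theta - theta_prev
    y = grad - grad_prev
    return (s @ s) / (m * (s @ y))
```

Because s and y are built from full gradients at snapshots rather than single stochastic gradients, the quotient is stable enough to hold for a whole epoch; for strongly convex objectives sᵀy > 0, so the step size stays positive.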
Vector Transport-Free SVRG with General Retraction for Riemannian Optimization: Complexity Analysis and Practical Implementation
In this paper, we propose a vector transport-free stochastic variance reduced gradient (SVRG) method with general retraction for empirical risk minimization over a Riemannian manifold. Existing SVRG methods on manifolds usually consider a specific retraction operation and involve additional computational costs such as parallel transport or vector transport. The vector transport-free SVRG with gen...
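As a toy illustration of what "retraction without vector transport" can look like, the sketch below works on the unit sphere embedded in R^d: the variance-reduced correction is formed with ambient-space gradients (so nothing needs to be moved between tangent spaces), then projected onto the current tangent space and retracted. This is our own rendering of the general idea, not the authors' algorithm.

```python
import numpy as np

def project_tangent(x, g):
    # Project an ambient vector onto the tangent space of the unit
    # sphere at x by removing the radial component.
    return g - (x @ g) * x

def retract(x, v):
    # A standard retraction on the sphere: take an ambient step,
    # then renormalize back onto the manifold.
    y = x + v
    return y / np.linalg.norm(y)

def vt_free_svrg_step(x, snapshot, grad_i, full_grad, eta):
    # Transport-free variance-reduced direction: all three terms are
    # ambient gradients, so no vector transport is required.
    v = grad_i(x) - grad_i(snapshot) + full_grad
    return retract(x, -eta * project_tangent(x, v))
```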
Journal: CoRR
Volume: abs/1611.01957
Issue: -
Pages: -
Publication date: 2016